Tags: machine learning* + llm*

  1. This article describes a workflow that uses Large Language Models (LLMs) to automate normalising spreadsheet data, making it tidy and machine-readable for easier analysis (a minimal prompting sketch follows the list below).
  2. A comprehensive guide to ultrascale machine learning, covering techniques, tools, and best practices.
  3. The attention mechanism in Large Language Models (LLMs) derives the meaning of a word from its context. Words are encoded as multi-dimensional vectors; query and key vectors are computed for each token; and attention weights adjust each embedding according to contextual relevance (see the NumPy sketch after this list).
  4. Qodo-Embed-1-1.5B is a state-of-the-art code embedding model designed for retrieval tasks in the software development domain. It supports multiple programming languages and is optimized for natural-language-to-code and code-to-code retrieval, making it highly effective for applications such as code search and retrieval-augmented generation (a retrieval sketch follows the list below).
  5. Qodo releases Qodo-Embed-1-1.5B, an open-source code embedding model that outperforms competitors from OpenAI and Salesforce, enhancing code search, retrieval, and understanding for enterprise development teams.
  6. SmolVLM2 marks a shift in video understanding technology by introducing efficient models that can run on devices from phones to servers. The release includes models in three sizes (2.2B, 500M, and 256M) with Python and Swift API support. The models offer video understanding with reduced memory consumption and ship with a suite of demo applications for practical use.
  7. The article delves into how large language models (LLMs) store facts, focusing on the role of multi-layer perceptrons (MLPs) in this process. It explains the mechanics of MLPs, including matrix multiplication, bias addition, and the Rectified Linear Unit (ReLU) function, using the example of encoding the fact that Michael Jordan plays basketball (a toy MLP sketch follows the list below). The article also discusses superposition, which lets models store a vast number of features by using nearly perpendicular directions in high-dimensional space.
  8. Sawmills AI has introduced a smart telemetry data management platform aimed at reducing costs and improving data quality for enterprise observability. By acting as a middleware layer that uses AI and ML to optimize telemetry data before it reaches vendors like Datadog and Splunk, Sawmills helps companies manage data efficiently, retain data sovereignty, and reduce unnecessary data processing costs.
  9. The article explores the architectural changes that enable DeepSeek's models to perform well with fewer resources, focusing on Multi-Head Latent Attention (MLA). It traces the evolution of attention mechanisms, from Bahdanau to the Transformer's Multi-Head Attention (MHA), and introduces Grouped-Query Attention (GQA) as a remedy for MHA's memory inefficiency (a GQA sketch follows the list below). The article highlights DeepSeek's competitive performance despite lower reported training costs.
  10. A comprehensive guide to Large Language Models by Damien Benveniste, with chapters spanning pre-transformer language models through deploying LLMs:

    - Language Models Before Transformers
    - Attention Is All You Need: The Original Transformer Architecture
    - A More Modern Approach To The Transformer Architecture
    - Multi-modal Large Language Models
    - Transformers Beyond Language Models
    - Non-Transformer Language Models
    - How LLMs Generate Text
    - From Words To Tokens
    - Training LLMs to Follow Instructions
    - Scaling Model Training
    - Fine-Tuning LLMs
    - Deploying LLMs
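
For item 1, a minimal sketch of the kind of tidying step such a workflow could use, assuming the openai Python client and an OpenAI-style chat model; the model name, prompt wording, and sample data are illustrative assumptions, not taken from the article:

```python
# Sketch of an LLM-based spreadsheet-tidying step. Assumes the openai
# Python client (>= 1.0); model name and prompt are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messy_csv = """Region,Q1 Sales,Q2 Sales
North,100,120
South,90,95
"""

prompt = (
    "Convert this wide-format CSV to tidy long format with columns "
    "region,quarter,sales. Return only CSV, no commentary.\n\n" + messy_csv
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; any capable chat model works
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # tidy, machine-readable CSV
```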
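
For item 3, a toy NumPy sketch of single-head scaled dot-product attention: each token's query is scored against every key, and the softmaxed weights mix the value vectors into a context-adjusted embedding. The dimensions and random projection matrices are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8                    # 4 words, 8-dim embeddings
X = rng.normal(size=(n_tokens, d_model))    # token embeddings

W_q = rng.normal(size=(d_model, d_model))   # learned projections (random here)
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v
scores = Q @ K.T / np.sqrt(d_model)         # query-key similarity
weights = softmax(scores)                   # attention weights, rows sum to 1
context = weights @ V                       # context-adjusted embeddings
print(weights.round(2))
```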
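
For items 4 and 5, a sketch of natural-language-to-code retrieval with an embedding model. It assumes Qodo-Embed-1-1.5B loads through the sentence-transformers library, as is typical for embedding models on the Hugging Face Hub, and that the repository ID matches the model name; check the model card for the exact usage:

```python
from sentence_transformers import SentenceTransformer, util

# Model ID assumed from the name in items 4-5; verify on the Hub.
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")

code_snippets = [
    "def add(a, b):\n    return a + b",
    "def read_file(path):\n    with open(path) as f:\n        return f.read()",
]
query = "function that sums two numbers"

code_emb = model.encode(code_snippets, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, code_emb)[0]  # cosine similarity per snippet
best = int(scores.argmax())
print(code_snippets[best], float(scores[best]))
```

Cosine similarity over a shared embedding space is what makes natural-language-to-code and code-to-code retrieval the same lookup.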
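
For item 7, a toy NumPy version of the MLP block that article describes: up-projection, bias, ReLU, down-projection, and a residual add. Shapes and random weights are illustrative; in a trained model, particular directions would encode features such as "Michael Jordan" and "plays basketball":

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)               # zeroes out inactive features

rng = np.random.default_rng(1)
d_model, d_hidden = 8, 32                   # transformer MLPs typically expand ~4x

W_up = rng.normal(size=(d_model, d_hidden))
b_up = np.zeros(d_hidden)
W_down = rng.normal(size=(d_hidden, d_model))
b_down = np.zeros(d_model)

x = rng.normal(size=d_model)                # e.g. an embedding carrying "Michael Jordan"
h = relu(x @ W_up + b_up)                   # rows of W_up act as feature detectors
y = x + (h @ W_down + b_down)               # residual add writes new info back
print(y.shape)
```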
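
For item 9, a toy sketch of Grouped-Query Attention (GQA): several query heads share each key/value head, so the KV cache holds far fewer K and V tensors than full multi-head attention would (MLA's latent compression is not shown). Sizes are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
n_tokens, d_head = 4, 8
n_q_heads, n_kv_heads = 8, 2                # 4 query heads share each KV head

Q = rng.normal(size=(n_q_heads, n_tokens, d_head))
K = rng.normal(size=(n_kv_heads, n_tokens, d_head))  # cached per KV head only
V = rng.normal(size=(n_kv_heads, n_tokens, d_head))

group = n_q_heads // n_kv_heads
outputs = []
for h in range(n_q_heads):
    kv = h // group                         # which shared KV head this query uses
    w = softmax(Q[h] @ K[kv].T / np.sqrt(d_head))
    outputs.append(w @ V[kv])
out = np.concatenate(outputs, axis=-1)      # (n_tokens, n_q_heads * d_head)
print(out.shape)                            # KV cache is 4x smaller than MHA's
```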
